Ruby for the Web
Ruby is the best language in the world for interacting with the web, and I'm going to show you why.
This is a response to Python for the Web from gun.io, written because I cannot stand it when people use the term "best" without qualifying which aspect they mean. And yes, "interacting with the web" is spongy enough to cover all kinds of things.
In order to honor their "Most rights reserved." footer, I won't rewrite their post; instead I'll give a succinct counter to each point.
Interacting with Websites and APIs Using Ruby
First we'll handle two simple HTTP requests from the client side. For this we use the excellent REST Client gem, which can be installed via rubygems.
gem install rest-client
require 'rest-client'
puts RestClient.get('http://gun.io')
require 'rest-client'
puts RestClient.get('https://YOURUSERNAME:PASSWORD@api.github.com/user')
require 'rest-client'
url = 'https://example.com/form'
data = {title: 'RoboCop', description: 'The best movie ever.'}
RestClient.post(url, data)
As before, you can use the basic auth syntax if you require basic or digest authentication.
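To make visible what the credentials and form fields turn into on the wire, here is a minimal sketch that builds the same kind of POST with Net::HTTP from the stdlib instead of REST Client, and never sends it. The helper name build_post is made up for this sketch, and the credentials are the placeholders from above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not send) a form POST with basic auth.
def build_post(url, user, password, fields)
  request = Net::HTTP::Post.new(URI(url))
  request.basic_auth(user, password) # sets the Authorization header
  request.set_form_data(fields)      # URL-encodes the fields into the body
  request
end

request = build_post('https://example.com/form', 'YOURUSERNAME', 'PASSWORD',
                     'title' => 'RoboCop',
                     'description' => 'The best movie ever.')

puts request['Authorization']
puts request.body
```

Nothing leaves your machine here; the request object is only inspected, which makes it handy for checking what an API will actually receive.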
Processing JSON in Ruby
Since JSON is in the stdlib, there is no need to install anything.
require 'json'
require 'rest-client'
c = RestClient.get('https://github.com/timeline.json')
j = JSON.parse(c)
j.each do |item|
  if repository = item['repository']
    puts repository['name']
  end
end
This also fixes a bug in the original code, as not every item in the timeline has a repository key.
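If you want to try the missing-key handling without hitting the network, the following sketch does the same thing on an inline string. The sample data and the helper name repository_names are invented for illustration and are not the real timeline format.

```ruby
require 'json'

# Made-up sample standing in for a timeline response; the second item
# deliberately has no "repository" key.
SAMPLE_TIMELINE = <<~JSON
  [
    {"type": "PushEvent",   "repository": {"name": "ramaze"}},
    {"type": "FollowEvent"},
    {"type": "WatchEvent",  "repository": {"name": "nokogiri"}}
  ]
JSON

# Hypothetical helper: collect repository names, skipping items without one.
def repository_names(json_text)
  JSON.parse(json_text)
      .map { |item| item.dig('repository', 'name') } # nil when the key is missing
      .compact                                       # drop those nils
end

puts repository_names(SAMPLE_TIMELINE)
```

Hash#dig returns nil instead of raising when an intermediate key is absent, so it plays the same role as the `if repository = item['repository']` guard above.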
Scraping the Web Using Ruby
Here I'll introduce you to Nokogiri, the binding for libxml2 and libxslt. The usage is heavily influenced by Hpricot and improves upon it in terms of speed, memory usage, accuracy, HTML correction, etc.
There is also an XML library in stdlib called REXML, but there is not a single time I've used it without regrets.
So first of all, install Nokogiri. On Linux this also requires libxml2 and libxslt. I have no idea about other systems, but the authors seem to have quite good documentation, so I'll leave the gritty details to them.
Here's how to do it on Arch Linux:
sudo pacman -S libxml2 libxslt
gem install nokogiri
And here's how to use it for HTML in combination with RestClient, although I'd personally use open-uri in this case for simplicity.
require 'nokogiri'
require 'rest-client'
tree = Nokogiri::HTML(RestClient.get('http://gun.io'))
tree.css('#frontsubtext').each do |element|
  puts element.text
end
Something that wasn't shown is how to use XPath. Since that's quite essential for most HTML and XML juggling, here we go:
require 'nokogiri'
require 'rest-client'
tree = Nokogiri::HTML(RestClient.get('http://gun.io'))
tree.xpath('//a').each do |element|
  puts "#{element.text} : #{element[:href]}"
end
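For playing with an XPath expression offline, without fetching anything or installing native libraries, even REXML will do on a small, well-formed snippet. This is only a sketch: the markup and the helper name links are made up, and REXML is swapped in here purely because it needs no install, not because I'd reach for it otherwise. The predicate `[@href]` restricts the match to anchors that actually carry an href attribute.

```ruby
require 'rexml/document'

# Made-up, well-formed sample markup; the middle anchor has no href.
SAMPLE_MARKUP = <<~XML
  <html>
    <body>
      <a href="http://gun.io">gun.io</a>
      <a>no link here</a>
      <a href="http://example.com/">Example</a>
    </body>
  </html>
XML

# Hypothetical helper: return [text, href] pairs for anchors with an href.
def links(markup)
  doc = REXML::Document.new(markup)
  REXML::XPath.match(doc, '//a[@href]').map do |element|
    [element.text, element.attributes['href']]
  end
end

links(SAMPLE_MARKUP).each do |text, href|
  puts "#{text} : #{href}"
end
```

The same `//a[@href]` expression can be passed to `tree.xpath` in the Nokogiri example above.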
Ruby Web Sites
And of course it's about time to plug my own project: Ramaze.
Let's make a little page equivalent to the gun.io example.
I won't go into much detail here; please check out the documentation, as it will answer any questions you have much better than I can here.
require 'ramaze'
class Home < Ramaze::Controller
  map '/'
  def index(*input)
    @output = input.join('/').upcase
    <<-'HTML'
      <!DOCTYPE html>
      <html>
        <head>
          <meta charset="utf-8">
          <title>#{@output}</title>
        </head>
        <body>
          Your output is: #{@output}
        </body>
      </html>
    HTML
  end
end
Ramaze.start
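To see what the action does with its arguments without starting a server, the string handling can be tried in plain Ruby. This sketch assumes the path segments after the mapped route arrive as arguments to index, so that a request for /hello/world behaves like index('hello', 'world'). page_for is a hypothetical stand-in, not part of Ramaze, and it interpolates the heredoc itself rather than leaving that to a template engine.

```ruby
# Hypothetical stand-in for the controller action; no Ramaze involved.
def page_for(*input)
  output = input.join('/').upcase
  <<~HTML
    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="utf-8">
        <title>#{output}</title>
      </head>
      <body>
        Your output is: #{output}
      </body>
    </html>
  HTML
end

puts page_for('hello', 'world')
```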